CRISP-DM Process

  1. Business Understanding

  2. Data Understanding

  3. Prepare Data

  4. Data Modeling

  5. Evaluate the Results

  6. Deploy

I- Business understanding

Who can host on Airbnb? Behind every stay is a host, a real person who can give you the details you need to check in and feel at home. Hosts can interact with guests in different ways, depending on the type of place or experience booked. Almost anyone can be a host. It's free to sign up and list both stays and experiences. Whether they're hosting a place to stay or a local activity, all hosts are expected to meet Airbnb's quality standards every time.

So we try to address the following issues:

a. the variance of price across the period covered by the data set, after removing outliers
b. the correlation between parameters
c. an attempt to analyze the text and comments of customers, to learn a little about what customers want to know
d. predicting the price of a unit with two models and evaluating those models

I.I Questions to answer

What price range covers most units?
Which factors affect the price?
What do guests say in their comments?

Right skewed

Removing Outliers

The price of most units lies between 20 and 500; prices outside that range are treated as outliers.
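The filtering step described above can be sketched as follows. This is a minimal illustration, not the original notebook code: the sample prices are made up, the function name `remove_price_outliers` is hypothetical, and the bounds are assumed to be inclusive.

```python
# Hypothetical sample of nightly prices; the real values come from the listings data.
prices = [15, 45, 80, 120, 250, 499, 500, 750, 1200]

# Assumed bounds, taken from the 20 to 500 range stated above (treated as inclusive).
LOW, HIGH = 20, 500

def remove_price_outliers(values, low=LOW, high=HIGH):
    """Keep only the prices inside the inclusive range [low, high]."""
    return [v for v in values if low <= v <= high]

kept = remove_price_outliers(prices)
print(kept)  # [45, 80, 120, 250, 499, 500]
```

If the data is held in a pandas DataFrame with a `price` column (an assumption about the layout), the same filter is usually written as `df[df["price"].between(20, 500)]`.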

After Removing Outliers

The price histogram above is still right skewed: most units are concentrated at the lower prices, and the number of units falls as the price rises.
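One way to check the skew numerically, rather than by eye, is to compute a skewness coefficient and compare the mean with the median. The sketch below uses made-up prices and a hypothetical helper named `skewness`; a positive value and a mean above the median are both consistent with a right-skewed distribution.

```python
from statistics import mean, median

def skewness(values):
    """Population skewness: third central moment divided by the variance to the power 1.5."""
    n = len(values)
    m = mean(values)
    m2 = sum((v - m) ** 2 for v in values) / n  # second central moment (variance)
    m3 = sum((v - m) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical prices after outlier removal: many low values, a few high ones.
prices = [40, 50, 60, 60, 75, 90, 120, 200, 450]

print(skewness(prices) > 0)            # True for a right-skewed sample
print(mean(prices) > median(prices))   # True: the long right tail pulls the mean up
```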

Listings

=============================================================

Studying the correlation between numerical parameters
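As a rough illustration of this step, the sketch below computes pairwise Pearson correlations between a few numerical columns. The column names (`price`, `accommodates`, `minimum_nights`) and their values are assumptions made for the example, not taken from the analysis itself.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical numerical columns from the listings table.
columns = {
    "price": [50, 80, 120, 200, 300],
    "accommodates": [1, 2, 4, 6, 8],
    "minimum_nights": [3, 1, 2, 1, 2],
}

# Build the full correlation matrix as a nested dict: matrix[a][b].
names = list(columns)
matrix = {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

for a in names:
    print(a, [round(matrix[a][b], 2) for b in names])
```

With pandas, `DataFrame.corr()` produces the same kind of matrix for all numeric columns at once.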

2.1 Analysis of the summary column with NLP tools, by measuring

1- subjectivity and polarity
2- the most repeated words
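The second measurement, the most repeated words, can be sketched with the standard library alone. The sample summaries, the small stop-word set, and the function name `most_repeated_words` are all hypothetical; a real analysis would likely use a fuller stop-word list.

```python
import re
from collections import Counter

# A deliberately small, assumed stop-word set for the example.
STOP_WORDS = {"the", "a", "an", "and", "in", "of", "to", "is", "with", "from"}

def most_repeated_words(texts, top_n=3):
    """Count lowercase words across all texts, skipping stop words."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts.most_common(top_n)

# Hypothetical values of the summary column.
summaries = [
    "Cozy apartment in the city center",
    "Bright apartment with a view of the city",
    "Quiet room, short walk from the city center",
]

top_words = most_repeated_words(summaries)
print(top_words)
```

Here "city" appears three times, and "apartment" and "center" twice each, so those three words make up the result.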

3- Prediction of price according to different parameters

3.1 Linear Regression

Final Outcomes

Two models were used to predict the price from a set of numerical and categorical variables:
1- Linear Regression
2- Random Forest Regressor
The plots show that the Random Forest Regressor is more accurate and achieves a high score on both the train and test sets.
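To make the comparison concrete, here is a small sketch of how such a score can be computed, assuming the "score" is the coefficient of determination (R²), which is what scikit-learn regressors report by default from `.score`. The target values and both sets of predictions are invented for illustration and are not results from the models above.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Hypothetical held-out prices and two hypothetical sets of predictions.
y_test = [100, 150, 200, 250]
close_predictions = [105, 145, 195, 255]   # small errors
rough_predictions = [130, 140, 170, 260]   # larger errors

print(round(r2_score(y_test, close_predictions), 3))  # 0.992
print(round(r2_score(y_test, rough_predictions), 3))  # 0.84
```

Computing the same score on the train set and the test set, for each model, gives the four numbers needed for the comparison; a large gap between train and test scores would suggest overfitting.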